ABSTRACT

A three-factor design was used to d 
effects of testing frequencies and feedback delays 
students' achievement in a beginning calculus cours 
test-period frequencies — daily quizzes (5-10 minute 
(20-30 minutes) , three midterm exams, or one midter 
Two feedback-delay levels for test returns were set 
meeting or three-day delay on quizzes, and one week 
Subjects were blocked on ability level (SAT scores) 
third dimension of the design. A constructed achiev 
.78 reliability was used as a criterion measure; th 
multiple-choice. Data from an attitude measure and 
for differences in dropout proportions are reported 
results of the three-factor analysis of covariance. 
classes given short daily quizzes had higher achiev 
classes with the delay in feedback were significant 
class receiving results the next meeting. No attitu 
were found nor any interaction due to aptitude leve 
Since classroom tests may be used not only to evaluate
achievement, but also to teach by communicating course objectives
and structure, by forcing students to review, and by
motivating more serious study, it may be expected that the
number of tests taken during a course and the manner in which
results of tests are communicated to the students would have
some effect on achievement. In a given subject and for a
given level of student ability, one combination of test frequency
may be more effective than another (Ammons, 1956;
McKeachie, 1963).

More frequent testing has generally been found to have
a favorable effect in mathematics classes (Schunert, 1951;
Mach, 1963; Proger, 1968; Nystrom, 1969; Collins, 1971) and
in psychology classes (Standlee and Popham, 1960; Feldhusen,
1964). While there are studies in which no significant differences
were revealed (Curo, 1963; Selakovich, 1962), in no
study has it been found that frequent tests have a detrimental
effect.

The simple principle that knowledge of results facilitates 
learning is one of the few generalizations clearly supported 
by research on college teaching (McKeachie, 1963). Knowledge 
of results affects motivation, rate of learning, and level 
reached by learning. For every task and every state of 
learning, there is probably an optimum delay for feedback of 
results (Ammons, 1956). There is some evidence (More, 1969;
English and Kinzer, 1956) that short delayed information
feedback enhances retention of higher order learnings.

The purpose of this study was to examine the effects on 
the achievement of beginning calculus students of four levels 
of test frequency and two levels of delay of test result feed- 
back, taking into consideration the mathematics aptitude of 
the students. 

Procedure

Subjects. Sixteen beginning analytic geometry and
calculus classes during the Fall 1971 academic quarter at
California State Polytechnic College were in the study. In
each class, three aptitude subgroups were identified. Ten
instructors taught the classes. Most of the 442 students in
the classes were in their first year of college. Class sizes
ranged from 11 to 35 with a mean of 28.

Design. The design of the study was a two-by-four-by-three
factorial design with two observations per cell. The
three fixed factors were D = delay of feedback of test results
(two levels), T = frequency of tests (four levels), and
A = aptitude (three levels). Two of the sixteen classes were
randomly assigned to each of eight test frequency-feedback
delay treatments, with the provision that an instructor having
two classes had them assigned to different treatments. The
criterion was mathematics achievement.
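
The layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the level labels follow the paper's notation, the class identifiers 1 to 16 and the function name `assign_classes` are hypothetical, and the sketch does not enforce the provision about instructors who taught two classes.

```python
import itertools
import random

DELAY_LEVELS = ["D1", "D2"]                   # two feedback-delay levels
FREQUENCY_LEVELS = ["T1", "T2", "T3", "T4"]   # four test-frequency levels
APTITUDE_LEVELS = ["A1", "A2", "A3"]          # three aptitude subgroups per class


def assign_classes(class_ids, seed=None):
    """Randomly assign two classes to each of the eight D x T treatments."""
    treatments = list(itertools.product(DELAY_LEVELS, FREQUENCY_LEVELS))
    class_ids = list(class_ids)
    if len(class_ids) != 2 * len(treatments):
        raise ValueError("expected exactly two classes per treatment")
    rng = random.Random(seed)
    rng.shuffle(class_ids)
    # Consecutive pairs in the shuffled list share one treatment.
    return {cls: treatments[i // 2] for i, cls in enumerate(class_ids)}


# Hypothetical class identifiers 1..16 (not taken from the study).
assignment = assign_classes(range(1, 17), seed=0)

# Each class contributes one mean per aptitude subgroup, so the
# 2 x 4 x 3 design has 24 cells with two observations in each.
cells = list(itertools.product(DELAY_LEVELS, FREQUENCY_LEVELS, APTITUDE_LEVELS))
print(len(cells), "cells,", 2 * len(cells), "class-subgroup means")
```

A fuller version would reshuffle until no instructor's two classes share a treatment.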
Variables. Mathematics achievement was measured by an
achievement test constructed especially for this study.
Concomitant variables were the mean class meeting time and
the mean class size. Other variables were the proportion of
student withdrawals from class and the mean score on an
attitude scale.

Treatments. The four levels of frequency of tests were:

T1: 5-10 minute quiz each day; one midterm exam.

T2: 20-30 minute quiz every fourth or fifth meeting;
one midterm exam.

T3: Three 30-50 minute midterm exams.

T4: One midterm exam.

The two levels of delay of feedback of test results were: 

D1: All graded quiz and exam papers returned and discussed
the next meeting after being taken.

D2: Graded T1 quizzes returned and discussed three meetings
after being taken; graded T2 quizzes returned and
discussed the meeting before the next quiz; all exams
returned and discussed one week after being taken.

All classes were administered the achievement test and the 
attitude scale at the end of the quarter. 

Aside from the experimental treatment combination assigned 
and the content coverage imposed by the textbook and course 
outline, each instructor was free to teach and evaluate his 
students as he wished. To avoid a possible "Hawthorne" effect, 
students were not informed of the study.

Measuring instruments. The achievement test used as
criterion measure was constructed as follows: A 59-item
multiple-choice test tailored to the textbook and content
emphasis of the course was constructed and administered to
160 third and fourth quarter calculus students. The test was
item-analyzed, "bad" items removed, and a reliable and valid
30-item test obtained. This test was administered in the last
week of the quarter. The class mean of the students' scores
was a measure of each class's achievement. The mean KR-20
reliability coefficient of the achievement test was .78.
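
For readers unfamiliar with the KR-20 coefficient, a minimal sketch of the standard Kuder-Richardson formula 20 follows. It assumes items scored 0 or 1 and uses the population variance of total scores; the function name `kr20` and the tiny response matrix are hypothetical and are not data from the study.

```python
def kr20(responses):
    """KR-20 reliability for a list of examinees' 0/1 item-score lists."""
    n = len(responses)       # number of examinees
    k = len(responses[0])    # number of items
    totals = [sum(person) for person in responses]
    mean_total = sum(totals) / n
    # Population variance of the total scores.
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("total scores have zero variance")
    # For each item, p is the proportion correct and q = 1 - p.
    sum_pq = 0.0
    for i in range(k):
        p = sum(person[i] for person in responses) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical responses: four examinees, three items.
example = [[1, 1, 1], [0, 0, 0], [1, 1, 0], [1, 0, 0]]
print(round(kr20(example), 3))
```

The paper reports a mean coefficient, which suggests the statistic was computed more than once and averaged; the sketch computes a single coefficient only.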

Scores on the mathematics portion of the College Entrance
Examination Board Scholastic Aptitude Test were used to
determine aptitude subgroups within each class. The 33rd and
67th percentiles of all SAT scores obtained were used to
assign students in each class to high, middle, or low aptitude
subgroups.
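
The percentile split can be illustrated as follows. This is a sketch under assumptions: the paper does not say how percentiles were interpolated or how scores falling exactly on a cut point were treated, so the default method of `statistics.quantiles` and the "less than or equal" rule are choices made here, and the scores are invented.

```python
import statistics


def aptitude_cutoffs(all_scores):
    """Return the 33rd and 67th percentile cut points of the pooled scores."""
    percentiles = statistics.quantiles(all_scores, n=100)  # 99 cut points
    return percentiles[32], percentiles[66]


def aptitude_group(score, low_cut, high_cut):
    """Classify one score; ties go to the lower group (an assumption)."""
    if score <= low_cut:
        return "low"
    if score <= high_cut:
        return "middle"
    return "high"


# Hypothetical pooled SAT mathematics scores, not data from the study.
scores = list(range(400, 800, 10))
low_cut, high_cut = aptitude_cutoffs(scores)
groups = [aptitude_group(s, low_cut, high_cut) for s in scores]
print(groups.count("low"), groups.count("middle"), groups.count("high"))
```

Because the cut points come from the pooled scores, an individual class need not contain equal numbers of students in the three subgroups.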

The Purdue master attitude scale, A Scale to Measure
Attitude Toward Any School Subject (Remmers, 1960), was the
instrument used to measure attitude. One of a series of
scales originated at Purdue University, it consists of a list of
17 statements. Students are directed to indicate those statements
with which they agree. The student's score is the median scale
value of those statements with which he agrees. The class
mean of the students' scores was taken as the measure of the
class's attitude.
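
The scoring rule just described (median scale value per student, then the class mean) can be sketched as below. The scale values and endorsement patterns are hypothetical placeholders, not the published values of the Purdue scale, and the handling of a student who endorses no statement is an assumption.

```python
import statistics


def attitude_score(scale_values, endorsed):
    """Median scale value of the statements a student agreed with."""
    chosen = [scale_values[i] for i in endorsed]
    if not chosen:
        return None  # no statement endorsed; left unscored in this sketch
    return statistics.median(chosen)


def class_attitude(scale_values, endorsements):
    """Mean of the students' scores, skipping unscored students."""
    scores = [attitude_score(scale_values, e) for e in endorsements]
    scores = [s for s in scores if s is not None]
    return statistics.mean(scores)


# Hypothetical scale values for five statements (the real scale has 17).
values = [10.3, 9.6, 8.5, 6.0, 3.1]
# Each inner list holds the indices of statements one student endorsed.
students = [[0, 1, 2], [2, 3], [1, 2, 3, 4]]
print(class_attitude(values, students))
```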

Analysis. A three-factor analysis of covariance was
used to test for differences among cell means on the achievement
test. A chi-square test for differences in dropout proportions
was used. Two-factor analyses of variance were used to test
for differences among the cell means for each of the concomitant
variables and for differences in attitude.

Results

Analyses of variance revealed that classes assigned the
D2 treatment met significantly later in the day with significantly
fewer students than those in the D1 treatment. Class
time and size were accordingly used as covariates in the 
analyses of covariance for differences among cell means in 
mathematics achievement. 

It was determined that the data provided by the achievement
test satisfied all the homogeneity and linearity assumptions
underlying analysis of covariance for which it was appropriate
to test. The adjusted cell means of the achievement test
scores are shown in Table 1.
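
To show what "adjusted" means here, the following sketch computes covariate-adjusted group means using the textbook one-covariate formula: each group mean is shifted by the pooled within-group slope times the group's deviation from the grand covariate mean. It is a simplification, since the study used two covariates (class time and class size) in a three-factor design, and the data and the name `adjusted_means` are hypothetical.

```python
def adjusted_means(groups):
    """groups maps a group name to a list of (covariate, outcome) pairs."""
    all_x = [x for pairs in groups.values() for x, _ in pairs]
    grand_x = sum(all_x) / len(all_x)
    sxy = 0.0  # pooled within-group sum of cross-products
    sxx = 0.0  # pooled within-group sum of squares of the covariate
    means = {}
    for name, pairs in groups.items():
        mx = sum(x for x, _ in pairs) / len(pairs)
        my = sum(y for _, y in pairs) / len(pairs)
        means[name] = (mx, my)
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
        sxx += sum((x - mx) ** 2 for x, _ in pairs)
    slope = sxy / sxx  # pooled within-group regression slope
    # Shift each outcome mean to the value expected at the grand covariate mean.
    return {name: my - slope * (mx - grand_x) for name, (mx, my) in means.items()}


# Hypothetical (class size, achievement) pairs for two treatments.
data = {"D1": [(30, 58), (32, 60)], "D2": [(24, 70), (26, 72)]}
print(adjusted_means(data))
```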

The analysis of covariance revealed that only the F
ratios for the main effects of factors D, T, and A were significant.
The highly significant F ratio for factor A (aptitude)
implied that the aptitude grouping by Scholastic Aptitude Test
was effective. The overall adjusted aptitude means were
A1 = 55.63, A2 = 62.29, and A3 = 74.53.

The significance of the F ratio for the main effect of 
factor T (frequency of tests) implied the existence of signi- 
ficant differences among the means of the four levels of that 
factor. To make comparisons between means, the Newman-Keuls 
procedure described by Winer (1962, p. 80) was followed. 

The significance of the F ratio for the main effect of
factor D (delay of test result feedback) implied the superiority
of the D2 treatment, since the adjusted means for the
treatments were D1 = 59.07 and D2 = 69.23.

A chi-square test for differences in dropout proportions
yielded a computed value of χ² = 4.05. The tabular value for
three degrees of freedom at the .05 level is χ² = 7.82.
Thus no significant differences in dropout proportions among
the classes in the eight treatment groups were found.
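
The comparison of the computed and tabular values can be checked numerically. The sketch below uses the closed-form upper-tail probability of a chi-square variate with three degrees of freedom, so no statistics library is needed. The function name `chi2_sf_df3` is introduced here for illustration, and the degrees of freedom are taken as reported in the text.

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variate with 3 degrees of freedom."""
    if x <= 0:
        return 1.0
    # Closed form available for 3 degrees of freedom.
    tail = math.erfc(math.sqrt(x / 2))
    return tail + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


computed = 4.05   # value reported in the text
tabular = 7.82    # .05-level critical value reported in the text

print("reject at .05 level:", computed > tabular)
print("tail area at tabular value:", round(chi2_sf_df3(tabular), 3))
print("tail area at computed value:", round(chi2_sf_df3(computed), 3))
```

The tail area at the computed value is well above .05, which agrees with the conclusion that the dropout proportions did not differ significantly.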

An analysis of variance was used to test for differences 
in the mean scores on the attitude scale. The cell means for 
the attitude scale are given in Table 2. None of the F ratios 
obtained in the analysis was significant, thus no significant 
differences in attitude were found among the classes in the 
eight treatment groups. 
7.4 7 7.54 7.76 7.70 7.62

Classes to which short daily quizzes were assigned had 
higher achievement in calculus, as measured by the achievement 
test, than classes given other test frequency treatments; 
significantly higher than those given only a midterm exam. 
All other differences among the means for the four levels of 
frequency of tests were not significant. This study thus 
provides additional evidence for the effectiveness of short 
frequent tests on achievement. 

Classes in the test feedback delay treatment D2
(long delay) had significantly higher achievement test
adjusted mean scores than those in treatment D1 (short delay).
This result is interesting and somewhat surprising. The
"optimum delay" suggested by Ammons (1956) apparently is
longer than one day for college freshman calculus students.
Such students are apparently able to mediate the two
time-disconnected events of taking a test and getting back the
test results over a longer period advantageously. The
discussion of test results after an intervening period during
which other topics are studied apparently serves as an effective
review and enhances learning, at least for freshman calculus
students.

The fact that there were no significant interaction 
effects revealed in the analysis of covariance permits the 
conclusion that students' aptitudes need not be a prime 
consideration in the determination of a particular technique 
of frequency of tests and feedback of test results to be used 
with beginning calculus students. It must be noted that this 
conclusion necessarily applies only to the very restricted 
(by college entrance and course enrollment requirements) 
aptitude range of the subjects in this study. 

The various levels of test frequency and feedback of test 
results did not result in significant differences in attitude, 
as measured by the Purdue master attitude scale employed in 
this study, nor were there significant differences among the 
proportions of students who withdrew from the classes.